Use bulk conversion in BCMath of BCD/CHAR where possible #14103

Merged 2 commits on May 1, 2024

Conversation

nielsdos
Member

@nielsdos nielsdos commented May 1, 2024

On my i7-4790 with benchmark from #14076, on top of #14101 I obtain the following results:

before (with #14101):

1.672737121582
2.3618471622467
2.3474779129028

after (with #14101 + this):

1.5878579616547
2.0568618774414
2.0204811096191

Member

@SakiTakamachi SakiTakamachi left a comment

amazing!

Member

@Girgias Girgias left a comment

ngl, I don't really understand what the code is doing :/ I trust it to do what it says, but I don't really understand the logic

#define SWAR_ONES (~((size_t) 0) / 0xFF)
#define SWAR_REPEAT(x) (SWAR_ONES * (x))

static char *bc_copy_and_shift_numbers(char *dest, const char *source, const char *source_end, unsigned char shift, bool add)
Member

Could we add the restrict keyword to dest and source? These APIs are only used from other C files, so we don't need to stay compatible with C++.

Member Author

Right, I'll add the restrict keyword.

I'll explain what the code does.
The intention is to copy each byte from source to dest while adding '0' to, or subtracting '0' from, each byte.
The idea of this patch is to read and write a whole machine word (4 or 8 bytes) at once where possible, adding/subtracting '0' on all bytes of that word in parallel.

SWAR_ONES will be of the form 0x01010101 for 32-bit or 0x0101010101010101 for 64-bit.
Example: SWAR_ONES * 0xAB will therefore be equal to 0xABABABAB for 32-bit or 0xABABABABABABABAB for 64-bit.
So in this case, for SWAR_REPEAT('0'), it will be a 32/64-bit word where each byte is equal to '0', i.e. 0x303030...
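
To make the macro arithmetic above concrete, here is a small sketch that checks the byte pattern of each value. The two macros are copied from the diff; `all_bytes_equal` and `swar_macros_selfcheck` are hypothetical helpers introduced only for this illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define SWAR_ONES (~((size_t) 0) / 0xFF)
#define SWAR_REPEAT(x) (SWAR_ONES * (x))

/* Hypothetical helper (not in the patch): returns 1 if every byte of
 * `word` equals `byte`, regardless of whether size_t is 4 or 8 bytes. */
static int all_bytes_equal(size_t word, unsigned char byte)
{
    for (size_t i = 0; i < sizeof(size_t); i++) {
        if (((word >> (i * 8)) & 0xFF) != byte) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical self-check of the three claims made in the comment. */
static void swar_macros_selfcheck(void)
{
    /* All-ones word divided by 0xFF leaves 0x01 in every byte. */
    assert(all_bytes_equal(SWAR_ONES, 0x01));
    /* Multiplying by a byte value replicates it into every byte. */
    assert(all_bytes_equal(SWAR_REPEAT(0xAB), 0xAB));
    /* '0' is 0x30 in ASCII, so this is the 0x3030... word. */
    assert(all_bytes_equal(SWAR_REPEAT('0'), 0x30));
}
```

The check is written against `sizeof(size_t)` rather than a fixed width so that it holds on both 32-bit and 64-bit builds.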

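Each byte holds either a BCD digit (0 to 9) or an ASCII digit ('0' to '9', i.e. 0x30 to 0x39), so adding or subtracting 0x30 can never carry or borrow from one byte into the next. We can therefore add/subtract 0x303030... on the entire 4/8 bytes at once, which is equivalent to adding/subtracting 0x30 on each byte individually.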

And to be complete: SWAR stands for "SIMD Within A Register"
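
Putting the explanation together, the bulk copy could look roughly like the sketch below. This is an illustration under assumptions, not the merged implementation: the name `copy_and_shift_sketch` is hypothetical, and using `memcpy` for the word loads and stores is a choice made here to sidestep alignment and aliasing concerns. It assumes every input byte is a valid digit for the chosen direction, so no carry or borrow crosses a byte boundary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SWAR_ONES (~((size_t) 0) / 0xFF)
#define SWAR_REPEAT(x) (SWAR_ONES * (x))

/* Hypothetical sketch (not the merged code): copies [source, source_end)
 * to dest, adding `shift` to every byte when `add` is true and subtracting
 * it otherwise. Returns a pointer just past the last byte written. */
static char *copy_and_shift_sketch(char *restrict dest, const char *restrict source,
                                   const char *source_end, unsigned char shift, bool add)
{
    const size_t shift_word = SWAR_REPEAT(shift);

    /* Bulk path: process one machine word (4 or 8 bytes) per iteration. */
    while ((size_t) (source_end - source) >= sizeof(size_t)) {
        size_t word;
        memcpy(&word, source, sizeof(word));
        /* One add/subtract updates every byte of the word at once. */
        word = add ? word + shift_word : word - shift_word;
        memcpy(dest, &word, sizeof(word));
        source += sizeof(size_t);
        dest += sizeof(size_t);
    }

    /* Tail path: handle the remaining bytes one at a time. */
    while (source < source_end) {
        *dest = add ? (char) (*source + shift) : (char) (*source - shift);
        dest++;
        source++;
    }

    return dest;
}
```

Calling it with `shift` set to `'0'` and `add` set to true turns BCD digits into ASCII characters; the same call with `add` set to false converts them back. Because each byte is transformed independently, the result does not depend on the machine's byte order.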

Member

Ahhh okay, well could you please write this in a comment in the file? :D

@nielsdos nielsdos merged commit 0a3ccc0 into php:master May 1, 2024
7 of 10 checks passed